- Subject: v24i051: News/mail gateway package, Part01/04
- Newsgroups: comp.sources.unix
- Approved: rsalz@uunet.UU.NET
- X-Checksum-Snefru: a0337cbd c86bfe66 f571c436 de881e78
-
- Submitted-by: Rich $alz <rsalz@bbn.com>
- Posting-number: Volume 24, Issue 51
- Archive-name: newsgate/part01
-
- This kit provides two programs for "linking" RFC822 Mail messages and
- RFC1036 Usenet News articles. Each half of the conversion is handled by a
- different program, mail2news or news2mail. A few utility programs are
- also included.
-
- With these programs and the right set of mail aliases and news sys and
- active file entries, it is possible to build any set of moderated,
- unmoderated, one-way, or bi-directional gateways between any set of news
- and mail groups and lists that your little heart desires.
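
As a rough illustration of what those alias and sys-file entries look like, the following C sketch formats one sendmail alias and the newsgroup-pattern field of one news sys entry, in the same shape that WriteOne() in gag.y and the example in gag.1 use. The function names (dot2dash, format_sendmail_alias, format_sys_patterns) and the sample paths are hypothetical and are not part of the kit; treat this as a sketch under those assumptions, not as the kit's code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy src into dst, turning every '.' into '-'
 * (comp.foo.bar becomes comp-foo-bar), truncating to fit in size bytes. */
static void dot2dash(const char *src, char *dst, size_t size)
{
    size_t i;

    assert(size > 0);
    for (i = 0; src[i] != '\0' && i + 1 < size; i++)
        dst[i] = (src[i] == '.') ? '-' : src[i];
    dst[i] = '\0';
}

/* Hypothetical helper: build a sendmail alias line that pipes mail sent
 * to `alias` into mail2news for posting to `ngroup`. */
static int format_sendmail_alias(char *out, size_t size, const char *alias,
                                 const char *mail2news, const char *ngroup)
{
    return snprintf(out, size, "%s: \"|%s -n %s\"", alias, mail2news, ngroup);
}

/* Hypothetical helper: build the newsgroup-pattern field of a sys entry.
 * Each distribution, and then the group itself, is followed by a
 * "!name.all" exclusion, as in the gag.1 example. */
static void format_sys_patterns(char *out, size_t size,
                                const char *const *distribs, size_t count,
                                const char *ngroup)
{
    size_t used = 0;
    size_t i;
    int n;

    assert(size > 0);
    out[0] = '\0';
    for (i = 0; i < count && used < size; i++) {
        n = snprintf(out + used, size - used, "%s,!%s.all,",
                     distribs[i], distribs[i]);
        if (n < 0)
            return;
        used += (size_t)n;
    }
    if (used < size)
        (void)snprintf(out + used, size - used, "%s,!%s.all", ngroup, ngroup);
}
```

Calling dot2dash() on a group name and passing the result to format_sendmail_alias() gives the kind of "post by mail" alias that gag writes with the -p flag; format_sys_patterns() gives the pattern list that goes into the second field of the sys entry.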
-
- #! /bin/sh
- # This is a shell archive. Remove anything before this line, then feed it
- # into a shell via "sh file" or similar. To overwrite existing files,
- # type "sh file -c".
- # The tool that generated this appeared in the comp.sources.unix newsgroup;
- # send mail to comp-sources-unix@uunet.uu.net if you want that tool.
- # Contents: README MANIFEST gag.1 gag.y regex.c
- # Wrapped by rsalz@litchi.bbn.com on Fri Mar 15 16:42:25 1991
- PATH=/bin:/usr/bin:/usr/ucb ; export PATH
- echo If this archive is complete, you will see the following message:
- echo ' "shar: End of archive 1 (of 4)."'
- if test -f 'README' -a "${1}" != "-c" ; then
- echo shar: Will not clobber existing file \"'README'\"
- else
- echo shar: Extracting \"'README'\" \(3833 characters\)
- sed "s/^X//" >'README' <<'END_OF_FILE'
- XIntroduction
- X------------
- XThis kit provides two programs for "linking" RFC822 Mail messages and
- XRFC1036 Usenet News articles. Each half of the conversion is handled by a
- Xdifferent program, mail2news or news2mail. A few utility programs are
- Xalso included.
- X
- XWith these programs and the right set of mail aliases and news sys and
- Xactive file entries, it is possible to build any set of moderated,
- Xunmoderated, one-way, or bi-directional gateways between any set of news
- Xand mail groups and lists that your little heart desires.
- X
- XIf you run a small site with a couple of mostly-local mailing lists, you
- Xprobably don't want to bother setting this stuff up. Instead, either
- Xconvert everything directly to News, or set up local moderated groups. On
- Xthe other hand, if you provide gateway service to the Internet (e.g., UCB)
- Xor to a large organization, then this stuff is for you. News, especially with
- Xthe proliferation of NNTP (RFC977) and related clients, is generally more
- Xefficient than mail for disk space, CPU cycles, and network usage.
- X
- XThe programs work with Sendmail or MMDF and News 2.11.8 or later. I don't
- Xknow of anything off-hand that would prevent this from working with C news
- X(did Erik ever say thank you, Henry?). We've only run the programs on BSD
- Xhosts, but a start toward a System V port has been made. Who knows, it
- Xmight even work right now.
- X
- XErik Fair <fair@apple.com> wrote the original version of this package a
- Xcouple of years ago as "nrecnews" distributed with 2.11 and as "gateway", a
- Xvery tricky awk/shell/sed script. I got copies, and recoded it all in C.
- XI also completely overhauled nrecnews, changed the names, added some
- Xutility programs, and wrote the documentation. It seems pretty solid now,
- Xand is processing a few hundred messages a week at BBN and elsewhere.
- X
- XInstallation
- X------------
- XMail2news uses the date parser from the 2.11 netnews distribution; edit
- Xthe Makefile to point to where you keep your source. Next, edit gate.h as
- Xappropriate for your system. There are some fairly detailed comments
- Xexplaining the meaning of the #define's, but good luck. Unfortunately,
- Xthis is one of those things that you have to already know a fair bit about
- Xbefore you can use it. As one way up the learning curve, play with the
- Xgag -- the hardest part is almost always getting the right aliases and news
- Xentries set up.
- X
- XThe "signoff" code is pulled out into a separate program and is also part
- Xof mail2news. This is so that a properly paranoid administrator can trap
- Xthe exit code, and forward rejected postings to a human to verify. If
- Xyou trust the heuristics in that program, just turn on the -F flag.
- X
- XAt this point I should be telling you how to set up the different types of
- Xgroups, but I'm too burnt out on this to bother. If someone has some
- Xwords here to get me started, I'd appreciate it.
- X
- XFlaws
- X-----
- XCross-posting and multiple mail recipients are not handled very well.
- XCross-post means each mailing list gets a separate message. Does
- XB news's multi-cast fix this, perhaps?
- X
- XSomeone sending mail to lista and listb means the message only goes into
- Xnews.list.a, or news.list.b, depending on which gateway alias gets it
- Xfirst, which isn't great. One hack, to avoid losing messages, is to munge
- Xthe message-id to put the newsgroup name in there.
- X
- XHow do you bypass the "more included than new text" check in inews?
- X
- XFinally
- X-------
- XI am maintaining this release, but Erik is still involved in the issues
- Xand might be interested in what you have to say. There is a mailing list
- Xfor sites doing this in a big way, primarily on the Internet. Write
- Xto news-n-mail-request@bbn.com to be added. Please don't ask to join
- Xif you're only doing local gateways of a couple of groups.
- X
- XIf you don't send e-mail, then use comp.mail.misc and news.software.b;
- Xcross-post.
- X
- XRich $alz
- Xrsalz@bbn.com
- END_OF_FILE
- if test 3833 -ne `wc -c <'README'`; then
- echo shar: \"'README'\" unpacked with wrong size!
- fi
- # end of 'README'
- fi
- if test -f 'MANIFEST' -a "${1}" != "-c" ; then
- echo shar: Will not clobber existing file \"'MANIFEST'\"
- else
- echo shar: Extracting \"'MANIFEST'\" \(1811 characters\)
- sed "s/^X//" >'MANIFEST' <<'END_OF_FILE'
- X File Name Archive # Description
- X----------------------------------------------------------
- XREADME 1 Installation notes
- XMANIFEST 1 This shipping list
- XMakefile 4 Guess...
- XTODO 4 Stuff I haven't gotten around to yet
- Xgag.1 1 Manual page for gag
- Xgag.y 1 Gateway alias generator program
- Xgate.h 3 Header file, #include'd by everyone
- Xhdr.c 3 Header-cracking and address re-writing routines
- Xlex.l 3 Tokenizer for gag
- Xmail-interface 3 Front-end to MMDF or sendmail
- Xmail2news.1 3 Manual page for mail2news
- Xmail2news.c 2 Main program for mail->news gateway
- Xmisc.c 3 Support routines for both programs
- Xmkmailpost.1 4 Manual page for mkmailpost
- Xmkmailpost.sh 2 Turn active file into "mailpost" commands
- Xnews2mail.1 3 Manual page for news2mail
- Xnews2mail.c 2 Main program for news->mail gateway
- Xnews2mail.sh 4 A script version of news2mail
- Xpatchlevel.h 4 Misteak recorder
- Xregex.3 2 Manual page for regex
- Xregex.c 1 Regular-expression matching routines
- Xrfc822.c 2 Reading and writing mail and news headers
- Xsignoff.1 4 Manual page for signoff
- Xsignoff.c 3 Program to filter out "add me" requests
- Xstr.c 4 String and support routines for mail2news
- Xsysexits.h 3 System exit codes
- Xtest-addr 4 Test cases for address-rewriting routines
- Xtest-gag 4 Test cases for gag program
- Xuucp-2-inet 3 Mapping of UUCP to Internet hostnames
- END_OF_FILE
- if test 1811 -ne `wc -c <'MANIFEST'`; then
- echo shar: \"'MANIFEST'\" unpacked with wrong size!
- fi
- # end of 'MANIFEST'
- fi
- if test -f 'gag.1' -a "${1}" != "-c" ; then
- echo shar: Will not clobber existing file \"'gag.1'\"
- else
- echo shar: Extracting \"'gag.1'\" \(8467 characters\)
- sed "s/^X//" >'gag.1' <<'END_OF_FILE'
- X.\" $Header: /nfs/papaya/u2/rsalz/src/newsgate/src/RCS/gag.1,v 1.5 91/02/12 14:43:31 rsalz Exp $
- X.TH GAG 1 LOCAL
- X.SH NAME
- Xgag \- Gateway alias generator
- X.SH SYNOPSIS
- X.in +.5i
- X.na
- X.ti -.5i
- X.B gag
- X[
- X.B \-a
- X] [
- X.BI \-b bbn_file
- X] [
- X.BI \-d dir
- X] [
- X.BI \-m mmdf_file
- X] [
- X.BI \-n news_file
- X] [
- X.B \-p
- X] [
- X.BI \-s sendmail_file
- X] [
- X.BI \-t PP_shell
- X] [
- X.BI \-u PP_user
- X] [
- X.I file
- X]
- X.ad
- X.in -.5i
- X.SH DESCRIPTION
- X.I Gag
- Xreads a control file and builds entries for
- X.IR sendmail (8),
- X.IR BBN ,
- X.IR MMDF (8),
- Xand
- X.IR PP (8)
- Xalias files, as well as
- X.IR Netnews (1)
- Xsys file entries.
- XIt is an auxiliary program for helping to maintain a mail/news
- Xgateway operation.
- XA typical
- X.I gag
- Xfile sets up default values for all the parameters, then contains a list
- Xof newsgroup\-mailing list pairs.
- XGroup-specific values can override the defaults.
- X.PP
- X.I Gag
- Xinput is fairly free-form.
- XIt ignores C-style comments, and whitespace is allowed anywhere.
- XThere are a couple of dozen keywords, and case is significant; most
- Xof the keywords are parameters used to control the gatewaying.
- XQuoted strings are written as they are in C, except they may not
- Xspan lines.
- X.PP
- XUsing the parameters enumerated below,
- X.I gag
- Xcan write six different sets of alias files:
- X.RS
- X.ta \w'Sendmail 'u
- X.nf
- XBBN if the ``\-b'' flag is used
- XMMDF if the ``\-m'' flag is used
- Xnews if the ``\-n'' flag is used
- XSendmail if the ``\-s'' flag is used
- XPP shell if the ``\-t'' flag is used
- XPP user if the ``\-u'' flag is used
- X.fi
- X.RE
- XAny combination of these flags is allowed.
- X.PP
- X.I Gag
- Xnormally complains about attempts to gateway groups that are not in
- Xthe news active file; the ``\-a'' flag suppresses this check, assuming
- Xthat all named groups are valid.
- X.PP
- XBBN-style
- Xaliases can't pipe into a command with parameters.
- XFor example,
- X.I sendmail
- Xcan have an entry like this in
- X.IR /usr/lib/aliases (5):
- X.RS
- X.ta \w'post-bboard: 'u
- X.nf
- Xpost-bboard: "|/usr/lib/news/mail2news -n bbn.bboard \e
- X -o 'BBN News/Mail Gateway' \e
- X -d bbn"
- X.fi
- X.RE
- XWith BBN, it is necessary to create an alias that feeds into a custom script:
- X.RS
- X.ta \w'bboard-gate: 'u
- X.nf
- Xbboard-gate: @mailer.bbn.com { "news|/usr/lib/news/.admin/gate-bboard" }
- X bboard-gate@mailer.bbn.com
- X.fi
- X.RE
- X.I Gag
- Xwill also create the utility script invoked by the above alias:
- X.RS
- X.ta \w'exec 'u
- X.nf
- X#! /bin/sh
- X## This script is on the "bboard" mailing list.
- Xexec /usr/lib/news/mail2news -n bbn.bboard \e
- X -o "BBN News/Mail Gateway" \e
- X -d bbn
- X.fi
- X.RE
- XTo create these scripts, use the ``\-d'' flag to name the directory where
- Xthey should be created; ``\-d .'' will create them in the current directory.
- X.PP
- X.I MMDF
- Xaliases are similar to
- X.IR sendmail 's,
- Xexcept that the scripts are run under a specified user id
- X(see below).
- X.PP
- X.I PP
- Xuses two files, a shell and user file, whose format is not described here.
- X.PP
- XSome sites want to create mail aliases that forward into each newsgroup;
- Xthat is, mail sent to ``comp-foo-bar'' should get posted to the ``comp.foo.bar''
- Xnewsgroup.
- XIf the ``\-p'' flag is used, then each gatewayed group will get such an
- Xalias created.
- XThe
- X.IR mkmailpost (1L)
- Xcommand can be used to create
- X.I gag
- X``mailpost'' commands for all entries in the news active file.
- XNote that using
- X.I mkmailpost
- Xand the ``\-p'' flag will almost certainly result in the creation of duplicate
- Xaliases.
- X.PP
- XThe parameters that control the gatewaying are:
- X.IP directory
- XThe directory where the BBN alias scripts are kept.
- X(In the above example, the directory is
- X.IR /usr/lib/news/.admin .)
- X.IP distributions
- XA space-separated list of distributions to forward from news to mail.
- XIn most cases, this will be the full set of distributions received, but
- Xit can be convenient, for example, to not forward regional articles out
- Xto a world-wide mailing list.
- XWith distributions set to ``world usa na'' then the news sys entry
- Xfor ``comp.foo'' will have this in the second field:
- X.in +.5i
- X.nf
- X<site>:\e
- X world,!world.all,usa,!usa.all,na,!na.all,comp.foo,!comp.foo.all\e
- X <rest of entry>
- X.in -.5i
- X.fi
- X.IP "inews flags"
- XThese are flags to pass on to
- X.IR inews (8)
- Xwhen gatewaying a mail message into netnews.
- XThey are put on the
- X.I mail2news
- Xcommand line, which will pass them along.
- XA common use is to set a default distribution or organization.
- X.IP mail2news
- XThe full pathname of the
- X.I mail2news
- Xprogram.
- X.IP mailcontact
- XThis is the name of the person listed in the ``For more information, contact''
- Xpart of the header generated by
- X.IR news2mail .
- X>>WHAT DOES GATEWAY DO??<<
- X.IP mailhost
- XWhen remote mailing lists are gatewayed into local newsgroups, it can
- Xoften be convenient to provide a local alias that forwards on to the
- Xremote host.
- XFor example, if the list ``info-foo'' is maintained at ``vax.host.edu''
- Xthen it is possible to create a local alias that just forwards to
- X``info-foo@vax.host.edu.''
- X.IP "mailinglist"
- XIf set to ``true,'' then
- X.I gag
- Xwill write aliases that forward on to the list at the current mailhost; if
- Xset to ``false'' then no such aliases will be written.
- X.IP moderator
- XIf set, then the value is put on the
- X.I mail2news
- Xcommand line with the ``\-a'' flag.
- XThis is useful for making a semi-moderated one-way gateway.
- X.IP news2mail
- XThe full pathname of the
- X.I news2mail
- Xprogram.
- X.IP organization
- XIf set, then the value is put on the
- X.I mail2news
- Xcommand line with the ``\-o'' flag.
- X.IP owner
- XIf set, then all
- X.I sendmail
- Xaliases also have an ``owner\-'' version, with this value as the
- Xrecipient, to receive trouble reports.
- X.IP requestaddr
- XMost mailing lists have a ``\-request'' address to handle list administration.
- XWhen gatewaying the list ``info-foo'' if the maintenance address isn't
- X``info-foo-request'' set this parameter to the proper address.
- X.IP site
- XThis is the name of the site to use in the news sys file; ``gateway'' is
- Xa common choice.
- X.IP user
- XWhen running scripts under
- X.I MMDF
- X(either normal or with the BBN style)
- Xa userid to setuid to must be given.
- X.PP
- XNote that there is some overlap in these parameters.
- XIn particular the ``moderator'' and ``organization'' parameters can really
- Xbe subsumed by the ``inews flags'' parameter.
- XThey are explicitly called out, however, for convenience in specifying
- Xthe types of gatewaying.
- XFor example, while all gatewayed groups might go out with the same
- Xdistribution, only some might need an
- X.I Approved:
- Xheader line.
- X.SH "THE LANGUAGE"
- XA
- X.I gag
- Xfile is composed of intermixed parts of three different constructs:
- Xdefaults, mail/news posting entries, mail/news gateway entries.
- X.PP
- XIndividual groups can override the default parameters.
- XThe defaults can be changed at any time, and retain the new value for
- Xthe rest of the file or until changed again.
- XFor example, most mail aliases will be ``owned'' by
- X.IR postmaster ,
- Xbut it might be convenient to set the forwarding address on specific
- Xaliases to someone else.
- XFor example:
- X.RS
- X.DT
- X.nf
- X/* Send gateway complaints to postmaster */
- Xdefault owner = "postmaster";
- X
- X/* The "animals" list is privately maintained. */
- Xgateway bbn.animals animal-rights
- X owner = "canus-major";
- X.fi
- X.RE
- XThe syntax is explained in more detail below.
- X.PP
- XA default parameter assignment looks like one of the following:
- X.RS
- X.DT
- X.nf
- X\fBdefault\fP <parameter> \fB= true ;\fP
- X\fBdefault\fP <parameter> \fB= false ;\fP
- X\fBdefault\fP <parameter> \fB=\fP "string value" \fB;\fP
- X\fBdefault\fP <parameter> \fB= dotify (\fP "value" \fB) ;\fP
- X.fi
- X.RE
- XThe last form is used to translate mixed-case strings into the prefix-period
- Xform used by mail2news, q.v.
- X.PP
- XA gateway declaration looks like this:
- X.RS
- X.DT
- X.nf
- X\fBgateway\fP <newsgroup> <mailinglist>
- X [ parameter settings ] \fB;\fP
- X.fi
- X.RE
- XThe parameter settings are like the default settings shown above, except
- Xthat the word ``default'' is left off, as are all semicolons but the
- Xlast one.
- X.PP
- XTo set up a simple unidirectional newsgroup/mailing list gateway, use a
- X``mailpost'' declaration:
- X.RS
- X.DT
- X.nf
- X\fBmailpost\fP <newsgroup> \fB;\fP
- X.fi
- X.RE
- X.SH "BUGS AND CAVEATS"
- XThis program is a collection of memory leaks; as it runs for a very short
- Xtime that's okay.
- X.PP
- XIn generating news
- X.I sys
- Xfile entries, remember that the command is a
- X.IR sprintf (3)
- Xformat string, and any percent signs
- X.RI ( % )
- Xmust be doubled or they will be taken as a formatting control.
- X.SH "SEE ALSO"
- Xmail2news(1L), mkmailpost(1L), news2mail(1L).
- X.SH AUTHORS
- XRich $alz <rsalz@bbn.com>, replacing some
- X.I m4
- Xscripts that
- X.br
- XErik E. Fair <fair@apple.com> wrote.
- X.br
- XPiete Brooks <pb@computer-lab.cambridge.ac.uk> provided the PP support.
- END_OF_FILE
- if test 8467 -ne `wc -c <'gag.1'`; then
- echo shar: \"'gag.1'\" unpacked with wrong size!
- fi
- # end of 'gag.1'
- fi
- if test -f 'gag.y' -a "${1}" != "-c" ; then
- echo shar: Will not clobber existing file \"'gag.y'\"
- else
- echo shar: Extracting \"'gag.y'\" \(14642 characters\)
- sed "s/^X//" >'gag.y' <<'END_OF_FILE'
- X%{
- X/*
- X** GAG
- X**
- X** Mail/news gateway alias generator.
- X*/
- X#define MAINLINE
- X#include "gate.h"
- X#ifdef RCSID
- Xstatic char RCS[] =
- X "$Header: /nfs/papaya/u2/rsalz/src/newsgate/src/RCS/gag.y,v 1.5 91/02/12 14:43:44 rsalz Exp $";
- X#endif /* RCSID */
- X
- X#define EMPTY(p) ((p) == NULL || (p)[0] == '\0')
- X
- Xchar *Pname; /* Program name */
- X
- Xint Errors; /* Did user screw up? */
- XSTATIC int NoGroupCheck; /* Don't check if valid group? */
- XSTATIC int PostViaMail; /* Make post-news-group alias? */
- XSTATIC char *OutDir; /* Directory for BBN scripts */
- XSTATIC FILE *bbn; /* File for BBN-style aliases */
- XSTATIC FILE *ppuser; /* File for PP user aliases */
- XSTATIC FILE *ppshell; /* File for PP shell aliases */
- XSTATIC FILE *mmdf; /* File for MMDF aliases */
- XSTATIC FILE *sendmail; /* File for Sendmail aliases */
- XSTATIC FILE *news; /* File for sys file entries */
- X
- XSTATIC char *CurDirectory, *DefDirectory;
- XSTATIC char **CurDistribs, **DefDistribs;
- XSTATIC int CurDoMailinglist, DefDoMailinglist;
- XSTATIC char *CurFlags, *DefFlags;
- XSTATIC char *CurMail2news, *DefMail2news;
- XSTATIC char *CurMailcontact, *DefMailcontact;
- XSTATIC char *CurMailhost, *DefMailhost;
- XSTATIC char *CurModerator, *DefModerator;
- XSTATIC char *CurNews2mail, *DefNews2mail;
- XSTATIC char *CurOrganization, *DefOrganization;
- XSTATIC char *CurOwner, *DefOwner;
- XSTATIC char *CurRequestAddr, *DefRequestAddr;
- XSTATIC char *CurSite, *DefSite;
- XSTATIC char *CurUser, *DefUser;
- X
- Xextern time_t time();
- X%}
- X
- X%union {
- X char *String;
- X int Bool;
- X}
- X
- X%token tDEFAULT tDIRECTORY tDISTRIBUTIONS tDOTIFY tFALSE tGATEWAY tFLAGS
- X%token tID tINEWS tMAIL2NEWS tMAILCONTACT tMAILHOST tMAILINGLIST
- X%token tMAILPOST tMODERATOR tNEWS2MAIL tORGANIZATION tOWNER tREQUESTADDR
- X%token tSITE tTRUE tUSER
- X
- X%type <String> tID value
- X%type <Bool> boolean
- X
- X%%
- X
- Xfile : /* NULL */
- X | file block ';'
- X | file default ';'
- X | file mpost ';'
- X | file error ';' {
- X#ifdef lint
- X /* Compulsive... */
- X if (Errors)
- X YYERROR;
- X#endif /* lint */
- X }
- X ;
- X
- Xdefault : tDEFAULT tDIRECTORY value { DefDirectory = $3; }
- X | tDEFAULT tDISTRIBUTIONS value { (void)Split($3, &DefDistribs, '\0'); }
- X | tDEFAULT tINEWS tFLAGS value { DefFlags = $4; }
- X | tDEFAULT tMAIL2NEWS value { DefMail2news = $3; }
- X | tDEFAULT tMAILCONTACT value { DefMailcontact = $3; }
- X | tDEFAULT tMAILHOST value { DefMailhost = $3; }
- X | tDEFAULT tMAILINGLIST boolean { DefDoMailinglist = $3; }
- X | tDEFAULT tMODERATOR value { DefModerator = $3; }
- X | tDEFAULT tNEWS2MAIL value { DefNews2mail = $3; }
- X | tDEFAULT tORGANIZATION value { DefOrganization = $3; }
- X | tDEFAULT tOWNER value { DefOwner = $3; }
- X | tDEFAULT tREQUESTADDR value { DefRequestAddr = $3; }
- X | tDEFAULT tSITE value { DefSite = $3; }
- X | tDEFAULT tUSER value { DefUser = $3; }
- X ;
- X
- Xblock : op_init tGATEWAY tID tID op_set {
- X if (!Errors && ValidNewsgroup($3))
- X WriteOne($3, $4);
- X }
- X ;
- X
- Xmpost : op_init tMAILPOST tID op_set {
- X char *GroupasMail;
- X
- X if (!Errors && ValidNewsgroup($3)) {
- X GroupasMail = Dot2Dash($3);
- X if (bbn)
- X BBNpostviamail($3, GroupasMail);
- X if (sendmail)
- X Fprintf(sendmail, "%s: \"|%s -n %s\"\n",
- X GroupasMail, CurMail2news, $3);
- X if (mmdf)
- X Fprintf(mmdf, "%s: \"%s|%s -n %s\"\n",
- X GroupasMail, CurUser, CurMail2news, $3);
- X if (ppuser)
- X Fprintf(ppuser, "%s:shell\n", GroupasMail);
- X if (ppshell)
- X Fprintf(ppshell, "%s:%s,,%s -n %s\n",
- X GroupasMail, CurUser, CurMail2news, $3);
- X }
- X }
- X ;
- X
- Xop_init : /* NULL */ {
- X CurDoMailinglist = DefDoMailinglist;
- X CurDistribs = DefDistribs;
- X CurDirectory = DefDirectory;
- X CurFlags = DefFlags;
- X CurMail2news = DefMail2news;
- X CurMailhost = DefMailhost;
- X CurMailcontact = DefMailcontact;
- X CurModerator = DefModerator;
- X CurNews2mail = DefNews2mail;
- X CurOrganization = DefOrganization;
- X CurOwner = DefOwner;
- X CurRequestAddr = DefRequestAddr;
- X CurSite = DefSite;
- X CurUser = DefUser;
- X }
- X ;
- X
- Xop_set : /* NULL */
- X | an_opt op_set
- X ;
- X
- Xan_opt : tDIRECTORY value { CurDirectory = $2; }
- X | tDISTRIBUTIONS value { (void)Split($2, &CurDistribs, '\0'); }
- X | tINEWS tFLAGS value { CurFlags = $3; }
- X | tMAIL2NEWS value { CurMail2news = $2; }
- X | tMAILCONTACT value { CurMailcontact = $2; }
- X | tMAILHOST value { CurMailhost = $2; }
- X | tMAILINGLIST boolean { CurDoMailinglist = $2; }
- X | tMODERATOR value { CurModerator = $2; }
- X | tNEWS2MAIL value { CurNews2mail = $2; }
- X | tORGANIZATION value { CurOrganization = $2; }
- X | tOWNER value { CurOwner = $2; }
- X | tREQUESTADDR value { CurRequestAddr = $2; }
- X | tSITE value { CurSite = $2; }
- X | tUSER value { CurUser = $2; }
- X ;
- X
- Xvalue : '=' tID {
- X $$ = $2;
- X }
- X | '=' tDOTIFY '(' tID ')' {
- X $$ = Dotify($4);
- X }
- X ;
- X
- Xboolean : '=' tTRUE {
- X $$ = TRUE;
- X }
- X | '=' tFALSE {
- X $$ = FALSE;
- X }
- X ;
- X%%
- X
- X
- X/*
- X** Copy the string s turning all '.' into '-'.
- X*/
- XSTATIC char *
- XDot2Dash(s)
- X register char *s;
- X{
- X register char *p;
- X char *save;
- X
- X for (save = p = COPY(s); *s; s++)
- X *p++ = *s == '.' ? '-' : *s;
- X *p = '\0';
- X return save;
- X}
- X
- X
- X/*
- X** Copy the string s putting a '.' before all uppercase letters and '.'.
- X*/
- XSTATIC char *
- XDotify(s)
- X register char *s;
- X{
- X register char *p;
- X char *save;
- X
- X for (save = p = NEW(char, strlen(s) * 2 + 1); *s; *p++ = *s++)
- X if (*s == '.' || isupper(*s))
- X *p++ = '.';
- X *p = '\0';
- X return save;
- X}
- X
- X
- X/*
- X** Check if the newsgroup exists in the ACTIVE file.
- X*/
- XSTATIC int
- XValidNewsgroup(Group)
- X register char *Group;
- X{
- X static char **File;
- X register char **p;
- X register char *q;
- X
- X if (NoGroupCheck)
- X return TRUE;
- X
- X if (File == NULL)
- X /* Read in active file, trim to just the newsgroup names. */
- X for (p = File = ReadFile(ACTIVE); *p; p++)
- X if (q = IDX(*p, ' '))
- X *q = '\0';
- X
- X for (p = File; *p; p++)
- X if (EQ(*p, Group))
- X return TRUE;
- X
- X Fprintf(stderr, "%s: ignoring invalid newsgroup \"%s\".\n", Pname, Group);
- X return FALSE;
- X}
- X
- X
- X/*
- X** Create a BBN alias set so that users can mail into a newsgroup
- X** as if it were a mailing list.
- X*/
- XSTATIC void
- XBBNpostviamail(Ngroup, GroupasMail)
- X char *Ngroup;
- X char *GroupasMail;
- X{
- X register FILE *F;
- X char buff[SM_SIZE];
- X
- X /* Create a post-news-group alias which pipes into
- X * the /bin/dir/post-news-group script. */
- X Fprintf(bbn, "post-%s: @%s { \"%s|%s/post-%s\" }\n",
- X GroupasMail, CurMailhost, CurUser, CurDirectory, GroupasMail);
- X Fprintf(bbn, "\tpost-%s@%s\n", GroupasMail, CurMailhost);
- X
- X /* Write a post-news-script which calls mail2news. */
- X if (OutDir) {
- X
- X /* Open the file. */
- X (void)sprintf(buff, "%s/post-%s", OutDir, GroupasMail);
- X (void)unlink(buff);
- X if ((F = fopen(buff, "w")) == NULL) {
- X Fprintf(stderr, "%s: Can't open \"%s\" for output, %s.\n",
- X Pname, buff, strerror(errno));
- X exit(1);
- X }
- X
- X /* Write the script. */
- X Fprintf(F, "#! /bin/sh\n");
- X Fprintf(F, "## This script forwards into the \"%s\" newsgroup.\n",
- X Ngroup);
- X Fprintf(F, "exec %s -n %s\n", CurMail2news, Ngroup);
- X
- X /* Close the file. */
- X (void)fclose(F);
- X (void)chmod(buff, 0755);
- X }
- X}
- X
- X
- X/*
- X** Write out one newsgroup/mailing list gatewaying entry. This is where
- X** the real work is done. We do BBN, MMDF, Sendmail, and news/sys file
- X** entries here.
- X*/
- XSTATIC void
- XWriteOne(Ngroup, Mlist)
- X register char *Ngroup;
- X register char *Mlist;
- X{
- X register char **p;
- X register FILE *F;
- X register char *GroupasMail;
- X char buff[SM_SIZE];
- X
- X GroupasMail = Dot2Dash(Ngroup);
- X
- X if (bbn) {
- X /* Create an alias that forwards to the script. */
- X Fprintf(bbn, "\n## Add this to the \"%s\" mailing list.\n", Mlist);
- X Fprintf(bbn, "%s-gate: @%s { \"%s|%s/gate-%s\" }\n",
- X Mlist, CurMailhost, CurUser, CurDirectory, Mlist);
- X Fprintf(bbn, "\t%s-gate@%s\n", Mlist, CurMailhost);
- X
- X /* Write the script. */
- X if (OutDir) {
- X if (IDX(CurOrganization, '"')) {
- X yyerror("Can't have \" in organization name");
- X free(GroupasMail);
- X return;
- X }
- X
- X /* Open the file. */
- X (void)sprintf(buff, "%s/gate-%s", OutDir, Mlist);
- X (void)unlink(buff);
- X if ((F = fopen(buff, "w")) == NULL) {
- X Fprintf(stderr, "%s: Can't open \"%s\" for output, %s.\n",
- X Pname, buff, strerror(errno));
- X exit(1);
- X }
- X
- X /* Write it. */
- X Fprintf(F, "#! /bin/sh\n");
- X Fprintf(F, "## This script is on the \"%s\" mailing list.\n",
- X Mlist);
- X Fprintf(F, "exec %s -n %s", CurMail2news, Ngroup);
- X if (!EMPTY(CurOrganization))
- X Fprintf(F, " \\\n\t-o \"%s\"", CurOrganization);
- X if (!EMPTY(CurModerator))
- X Fprintf(F, " \\\n\t-a %s", CurModerator);
- X if (!EMPTY(CurFlags))
- X Fprintf(F, " \\\n\t%s", CurFlags);
- X Fprintf(F, "\n");
- X
- X /* Close it. */
- X (void)fclose(F);
- X (void)chmod(buff, 0755);
- X }
- X
- X if (PostViaMail)
- X BBNpostviamail(Ngroup, GroupasMail);
- X }
- X
- X if (sendmail || mmdf || ppshell || ppuser) {
- X if (IDX(CurOrganization, '\'')) {
- X yyerror("Can't have ' in organization name");
- X free(GroupasMail);
- X return;
- X }
- X /* Does it make sense to do this? */
- X if (!EMPTY(CurModerator) && CurDoMailinglist)
- X Fprintf(stderr, "Warning: group %s is moderated and mailable.\n",
- X Ngroup);
- X }
- X
- X if (sendmail) {
- X Fprintf(sendmail, "\n## %s <==> %s gateway\n", Ngroup, Mlist);
- X Fprintf(sendmail, "%s%s: %s@%s\n",
- X CurDoMailinglist ? "" : "#", Mlist, Mlist, CurMailhost);
- X if (!EMPTY(CurOwner))
- X Fprintf(sendmail, "%sowner-%s: %s\n",
- X CurDoMailinglist ? "" : "#", Mlist, CurOwner);
- X Fprintf(sendmail, "post-%s: \"|%s -n %s", Mlist, CurMail2news, Ngroup);
- X if (!EMPTY(CurOrganization))
- X Fprintf(sendmail, " -o '%s'", CurOrganization);
- X if (!EMPTY(CurFlags))
- X Fprintf(sendmail, " %s", CurFlags);
- X if (!EMPTY(CurModerator))
- X Fprintf(sendmail, " -a %s", CurModerator);
- X Fprintf(sendmail, "\"\n");
- X if (!EMPTY(CurOwner))
- X Fprintf(sendmail, "owner-post-%s: %s\n", Mlist, CurOwner);
- X
- X if (PostViaMail) {
- X Fprintf(sendmail, "%s: \"|%s -n %s\"\n",
- X GroupasMail, CurMail2news, Ngroup);
- X if (!EMPTY(CurOwner))
- X Fprintf(sendmail, "owner-%s: %s\n", GroupasMail, CurOwner);
- X }
- X }
- X
- X if (mmdf) {
- X Fprintf(mmdf, "\n## %s <==> %s gateway\n", Ngroup, Mlist);
- X Fprintf(mmdf, "%s%s: %s@%s\n",
- X CurDoMailinglist ? "" : "#", Mlist, Mlist, CurMailhost);
- X Fprintf(mmdf, "post-%s: \"%s|%s -n %s",
- X Mlist, CurUser, CurMail2news, Ngroup);
- X if (!EMPTY(CurOrganization))
- X Fprintf(mmdf, " -o '%s'", CurOrganization);
- X if (!EMPTY(CurFlags))
- X Fprintf(mmdf, " %s", CurFlags);
- X if (!EMPTY(CurModerator))
- X Fprintf(mmdf, " -a %s", CurModerator);
- X Fprintf(mmdf, "\"\n");
- X
- X if (PostViaMail)
- X Fprintf(mmdf, "%s: \"%s|%s -n %s\"\n",
- X GroupasMail, CurUser, CurMail2news, Ngroup);
- X }
- X
- X if (ppuser) {
- X Fprintf(ppuser, "\n## %s <==> %s gateway\n", Ngroup, Mlist);
- X Fprintf(ppuser, "%s%s:822 %s@%s\n",
- X CurDoMailinglist ? "" : "#", Mlist, Mlist, CurMailhost);
- X Fprintf(ppuser, "post-%s: shell\n", Mlist);
- X if (PostViaMail)
- X Fprintf(ppuser, "%s:shell\n", GroupasMail);
- X }
- X
- X if (ppshell) {
- X Fprintf(ppshell, "\n## %s <==> %s gateway\n", Ngroup, Mlist);
- X Fprintf(ppshell, "post-%s:%s,,%s -n %s",
- X Mlist, CurUser, CurMail2news, Ngroup);
- X if (!EMPTY(CurOrganization))
- X Fprintf(ppshell, " -o '%s'", CurOrganization);
- X if (!EMPTY(CurFlags))
- X Fprintf(ppshell, " %s", CurFlags);
- X if (!EMPTY(CurModerator))
- X Fprintf(ppshell, " -a %s", CurModerator);
- X Fprintf(ppshell, "\n");
- X if (PostViaMail)
- X Fprintf(ppshell, "%s:%s,,%s -n %s\n",
- X GroupasMail, CurUser, CurMail2news, Ngroup);
- X }
- X
- X if (news) {
- X if (CurSite == NULL)
- X Fprintf(stderr, "No site for newsgroup %s\n", Ngroup);
- X else {
- X /* Pseudo-site name and distributions. */
- X Fprintf(news, "%s\\\n :", CurSite);
- X for (p = CurDistribs; *p; p++)
- X Fprintf(news, "%s,!%s.all,", *p, *p);
- X Fprintf(news, "%s,!%s.all\\\n ", Ngroup, Ngroup);
- X
- X /* Command invocation. */
- X if (CurRequestAddr)
- X Fprintf(news, " ::%s %s %s %s %s %%s\n",
- X CurNews2mail, Mlist, CurMailcontact, CurRequestAddr,
- X CurMailhost);
- X else
- X Fprintf(news, " ::%s %s %s %s-request %s %%s\n",
- X CurNews2mail, Mlist, CurMailcontact, Mlist,
- X CurMailhost);
- X }
- X }
- X
- X /* Clean up. */
- X free(GroupasMail);
- X}
- X
- X
- X/*
- X** Open a file, or use - for standard output. Write the prolog.
- X*/
- XSTATIC FILE *
- Xopenfile(name, timestring)
- X char *name;
- X char *timestring;
- X{
- X FILE *F;
- X
- X if (EQ(name, "-"))
- X F = stdout;
- X else if ((F = fopen(name, "w")) == NULL) {
- X Fprintf(stderr, "%s: Can't open \"%s\" for output, %s.\n",
- X Pname, name, strerror(errno));
- X exit(1);
- X }
- X
- X /* Write prolog. */
- X Fprintf(F, "## %s\n", "--START-OF-GATEWAY-OUTPUT-");
- X Fprintf(F, "## Created at %s", timestring);
- X Fprintf(F, "## %s\n",
- X "This section of the file has been built automatically.");
- X Fprintf(F, "## %s\n",
- X "If you make any changes here they will be lost when it is rebuilt.");
- X
- X return F;
- X}
- X
- X
- X/*
- X** Close a file, write the epilog.
- X*/
- XSTATIC void
- Xclosefile(F)
- X FILE *F;
- X{
- X if (F) {
- X Fprintf(F, "## %s\n", "--END-OF-GATEWAY-OUTPUT-");
- X if (F != stdout)
- X (void)fclose(F);
- X }
- X}
- X
- X
- XSTATIC void
- XUsage()
- X{
- X Fprintf(stderr, "Usage:\n\t%s %s\n\t\t%s input\n",
- X Pname,
- X "[-a] [-p] [-d dir] [-b bbn] [-m mmdf] [-n news]",
- X "[-s sendmail] [-t PP_shell] [-u PP_user]");
- X exit(1);
- X}
- X
- X
- Xmain(ac, av)
- X int ac;
- X char *av[];
- X{
- X int c;
- X time_t now;
- X char *timestring;
- X
- X /* Set defaults. */
- X Pname = (Pname = RDX(av[0], '/')) ? Pname + 1 : av[0];
- X (void)umask(0);
- X now = time((time_t *)NULL);
- X timestring = ctime(&now);
- X
- X /* Parse JCL. */
- X while ((c = getopt(ac, av, "ab:d:m:n:ps:t:u:")) != EOF)
- X switch (c) {
- X default:
- X Usage();
- X /* NOTREACHED */
- X case 'a':
- X NoGroupCheck = TRUE;
- X break;
- X case 'b':
- X bbn = openfile(optarg, timestring);
- X break;
- X case 'd':
- X OutDir = optarg;
- X break;
- X case 'm':
- X mmdf = openfile(optarg, timestring);
- X break;
- X case 'n':
- X news = openfile(optarg, timestring);
- X break;
- X case 'p':
- X PostViaMail++;
- X break;
- X case 's':
- X sendmail = openfile(optarg, timestring);
- X break;
- X case 't':
- X ppshell = openfile(optarg, timestring);
- X break;
- X case 'u':
- X ppuser = openfile(optarg, timestring);
- X break;
- X }
- X
- X /* Get input. */
- X ac -= optind;
- X av += optind;
- X if (ac != 0 && ac != 1)
- X Usage();
- X yyopen(av[0]);
- X
- X
- X /* Do the work. */
- X (void)yyparse();
- X
- X /* Close files. */
- X closefile(bbn);
- X closefile(mmdf);
- X closefile(news);
- X closefile(sendmail);
- X closefile(ppuser);
- X closefile(ppshell);
- X
- X /* That's all she wrote... */
- X exit(Errors == 0 ? 0 : 1);
- X /* NOTREACHED */
- X}
- END_OF_FILE
- if test 14642 -ne `wc -c <'gag.y'`; then
- echo shar: \"'gag.y'\" unpacked with wrong size!
- fi
- # end of 'gag.y'
- fi
- if test -f 'regex.c' -a "${1}" != "-c" ; then
- echo shar: Will not clobber existing file \"'regex.c'\"
- else
- echo shar: Extracting \"'regex.c'\" \(19043 characters\)
- sed "s/^X//" >'regex.c' <<'END_OF_FILE'
- X#ifndef lint
- Xstatic char RCS[] =
- X "$Header: /nfs/papaya/u2/rsalz/src/newsgate/src/RCS/regex.c,v 1.3 89/07/10 12:20:12 rsalz Exp $";
- X#endif /* lint */
- X/*
- X * regex - Regular expression pattern matching
- X * and replacement
- X *
- X *
- X * By: Ozan S. Yigit (oz)
- X * Dept. of Computer Science
- X * York University
- X *
- X *
- X * These routines are the PUBLIC DOMAIN equivalents
- X * of regex routines as found in 4.nBSD UN*X, with minor
- X * extensions.
- X *
- X * These routines are derived from various implementations
- X * found in software tools books, and Conroy's grep. They
- X * are NOT derived from licensed/restricted software.
- X * For more interesting/academic/complicated implementations,
- X * see Henry Spencer's regexp routines, or GNU Emacs pattern
- X * matching module.
- X *
- X * Routines:
- X * re_comp: compile a regular expression into
- X * a DFA.
- X *
- X * char *re_comp(s)
- X * char *s;
- X *
- X * re_exec: execute the DFA to match a pattern.
- X *
- X * int re_exec(s)
- X * char *s;
- X *
- X * re_modw: change re_exec's understanding of what
- X * a "word" looks like (for \< and \>)
- X * by adding into the hidden word-character
- X * table.
- X *
- X * void re_modw(s)
- X * char *s;
- X *
- X * re_subs: substitute the matched portions in
- X * a new string.
- X *
- X * int re_subs(src, dst)
- X * char *src;
- X * char *dst;
- X *
- X * re_fail: failure routine for re_exec.
- X *
- X * void re_fail(msg, op)
- X * char *msg;
- X * char op;
- X *
- X * Regular Expressions:
- X *
- X * [1] char matches itself, unless it is a special
- X * character (metachar): . \ [ ] * + ^ $
- X *
- X * [2] . matches any character.
- X *
- X * [3] \ matches the character following it, except
- X * when followed by a left or right round bracket,
- X * a digit 1 to 9 or a left or right angle bracket.
- X * (see [7], [8] and [9])
- X * It is used as an escape character for all
- X * other meta-characters, and itself. When used
- X * in a set ([4]), it is treated as an ordinary
- X * character.
- X *
- X * [4] [set] matches one of the characters in the set.
- X * If the first character in the set is "^",
- X * it matches a character NOT in the set. A
- X * shorthand S-E is used to specify a set of
- X * characters S up to E, inclusive. The special
- X * characters "]" and "-" have no special
- X * meaning if they appear as the first chars
- X * in the set.
- X * examples: match:
- X *
- X * [a-z] any lowercase alpha
- X *
- X * [^]-] any char except ] and -
- X *
- X * [^A-Z] any char except uppercase
- X * alpha
- X *
- X * [a-zA-Z] any alpha
- X *
- X * [5] * any regular expression form [1] to [4], followed by
- X * closure char (*) matches zero or more matches of
- X * that form.
- X *
- X * [6] + same as [5], except it matches one or more.
- X *
- X * [7] a regular expression in the form [1] to [10], enclosed
- X * as \(form\) matches what form matches. The enclosure
- X * creates a set of tags, used for [8] and for
- X * pattern substitution. The tagged forms are numbered
- X * starting from 1.
- X *
- X * [8] a \ followed by a digit 1 to 9 matches whatever a
- X * previously tagged regular expression ([7]) matched.
- X *
- X * [9] \< a regular expression starting with a \< construct
- X * \> and/or ending with a \> construct, restricts the
- X * pattern matching to the beginning of a word, and/or
- X * the end of a word. A word is defined to be a character
- X * string beginning and/or ending with the characters
- X * A-Z a-z 0-9 and _. It must also be preceded and/or
- X * followed by any character outside those mentioned.
- X *
- X * [10] a composite regular expression xy where x and y
- X * are in the form [1] to [10] matches the longest
- X * match of x followed by a match for y.
- X *
- X * [11] ^ a regular expression starting with a ^ character
- X * $ and/or ending with a $ character, restricts the
- X * pattern matching to the beginning of the line,
- X * or the end of line. [anchors] Elsewhere in the
- X * pattern, ^ and $ are treated as ordinary characters.
- X *
- X *
- X * Acknowledgements:
- X *
- X * HCR's Hugh Redelmeier has been most helpful in various
- X * stages of development. He convinced me to include BOW
- X * and EOW constructs, originally invented by Rob Pike at
- X * the University of Toronto.
- X *
- X * References:
- X * Software tools Kernighan & Plauger
- X * Software tools in Pascal Kernighan & Plauger
- X * Grep [rsx-11 C dist] David Conroy
- X * ed - text editor Un*x Programmer's Manual
- X * Advanced editing on Un*x B. W. Kernighan
- X * RegExp routines Henry Spencer
- X *
- X * Notes:
- X *
- X * This implementation uses a bit-set representation for character
- X * classes for speed and compactness. Each character is represented
- X * by one bit in a 128-bit block. Thus, CCL or NCL always takes a
- X * constant 16 bytes in the internal dfa, and re_exec does a single
- X * bit comparison to locate the character in the set.
- X *
- X * Examples:
- X *
- X * pattern: foo*.*
- X * compile: CHR f CHR o CLO CHR o END CLO ANY END END
- X * matches: fo foo fooo foobar fobar foxx ...
- X *
- X * pattern: fo[ob]a[rz]
- X * compile: CHR f CHR o CCL bitset CHR a CCL bitset END
- X * matches: fobar fooar fobaz fooaz
- X *
- X * pattern: foo\\+
- X * compile: CHR f CHR o CHR o CHR \ CLO CHR \ END END
- X * matches: foo\ foo\\ foo\\\ ...
- X *
- X * pattern: \(foo\)[1-3]\1 (same as foo[1-3]foo)
- X * compile: BOT 1 CHR f CHR o CHR o EOT 1 CCL bitset REF 1 END
- X * matches: foo1foo foo2foo foo3foo
- X *
- X * pattern: \(fo.*\)-\1
- X * compile: BOT 1 CHR f CHR o CLO ANY END EOT 1 CHR - REF 1 END
- X * matches: foo-foo fo-fo fob-fob foobar-foobar ...
- X *
- X */
- X
- X#define MAXDFA 1024
- X#define MAXTAG 10
- X
- X#define OKP 1
- X#define NOP 0
- X
- X#define CHR 1
- X#define ANY 2
- X#define CCL 3
- X#define NCL 4
- X#define BOL 5
- X#define EOL 6
- X#define BOT 7
- X#define EOT 8
- X#define BOW 9
- X#define EOW 10
- X#define REF 11
- X#define CLO 12
- X
- X#define END 0
- X
- X/*
- X * The following defines are not meant
- X * to be changeable. They are for readability
- X * only.
- X *
- X */
- X#define MAXCHR 128
- X#define CHRBIT 8
- X#define BITBLK (MAXCHR/CHRBIT)
- X#define BLKIND 0170
- X#define BITIND 07
- X
- X#define ASCIIB 0177
- X
- Xtypedef /*unsigned*/ char CHAR;
- X
- Xstatic int tagstk[MAXTAG]; /* subpat tag stack..*/
- Xstatic CHAR dfa[MAXDFA]; /* automaton.. */
- Xstatic int sta = NOP; /* status of lastpat */
- X
- Xstatic CHAR bittab[BITBLK]; /* bit table for CCL */
- X
- Xstatic void
- Xchset(c) register CHAR c; { bittab[((c)&BLKIND)>>3] |= 1<<((c)&BITIND); }
- X
- X#define badpat(x) return(*dfa = END, x)
- X#define store(x) *mp++ = x
- X
- Xchar *
- Xre_comp(pat)
- Xchar *pat;
- X{
- X register char *p; /* pattern pointer */
- X register CHAR *mp=dfa; /* dfa pointer */
- X register CHAR *lp; /* saved pointer.. */
- X register CHAR *sp=dfa; /* another one.. */
- X
- X register int tagi = 0; /* tag stack index */
- X register int tagc = 1; /* actual tag count */
- X
- X register int n;
- X int c1, c2;
- X
- X if (!pat || !*pat)
- X if (sta)
- X return(0);
- X else
- X badpat("No previous regular expression");
- X sta = NOP;
- X
- X for (p = pat; *p; p++) {
- X lp = mp;
- X switch(*p) {
- X
- X case '.': /* match any char.. */
- X store(ANY);
- X break;
- X
- X case '^': /* match beginning.. */
- X if (p == pat)
- X store(BOL);
- X else {
- X store(CHR);
- X store(*p);
- X }
- X break;
- X
- X case '$': /* match endofline.. */
- X if (!*(p+1))
- X store(EOL);
- X else {
- X store(CHR);
- X store(*p);
- X }
- X break;
- X
- X case '[': /* match char class..*/
- X
- X if (*++p == '^') {
- X store(NCL);
- X p++;
- X }
- X else
- X store(CCL);
- X
- X if (*p == '-') /* real dash */
- X chset(*p++);
- X if (*p == ']') /* real brac */
- X chset(*p++);
- X while (*p && *p != ']') {
- X if (*p == '-' && *(p+1) && *(p+1) != ']') {
- X p++;
- X c1 = *(p-2) + 1;
- X c2 = *p++;
- X while (c1 <= c2)
- X chset(c1++);
- X }
- X#ifdef EXTEND
- X else if (*p == '\\' && *(p+1)) {
- X p++;
- X chset(*p++);
- X }
- X#endif
- X else
- X chset(*p++);
- X }
- X if (!*p)
- X badpat("Missing ]");
- X
- X for (n = 0; n < BITBLK; bittab[n++] = (char) 0)
- X store(bittab[n]);
- X
- X break;
- X
- X case '*': /* match 0 or more.. */
- X case '+': /* match 1 or more.. */
- X if (p == pat)
- X badpat("Empty closure");
- X lp = sp; /* previous opcode */
- X if (*lp == CLO) /* equivalence.. */
- X break;
- X switch(*lp) {
- X
- X case BOL:
- X case BOT:
- X case EOT:
- X case BOW:
- X case EOW:
- X case REF:
- X badpat("Illegal closure");
- X default:
- X break;
- X }
- X
- X if (*p == '+')
- X for (sp = mp; lp < sp; lp++)
- X store(*lp);
- X
- X store(END);
- X store(END);
- X sp = mp;
- X while (--mp > lp)
- X *mp = mp[-1];
- X store(CLO);
- X mp = sp;
- X break;
- X
- X case '\\': /* tags, backrefs .. */
- X switch(*++p) {
- X
- X case '(':
- X if (tagc < MAXTAG) {
- X tagstk[++tagi] = tagc;
- X store(BOT);
- X store(tagc++);
- X }
- X else
- X badpat("Too many \\(\\) pairs");
- X break;
- X case ')':
- X if (*sp == BOT)
- X badpat("Null pattern inside \\(\\)");
- X if (tagi > 0) {
- X store(EOT);
- X store(tagstk[tagi--]);
- X }
- X else
- X badpat("Unmatched \\)");
- X break;
- X case '<':
- X store(BOW);
- X break;
- X case '>':
- X if (*sp == BOW)
- X badpat("Null pattern inside \\<\\>");
- X store(EOW);
- X break;
- X case '1':
- X case '2':
- X case '3':
- X case '4':
- X case '5':
- X case '6':
- X case '7':
- X case '8':
- X case '9':
- X n = *p-'0';
- X if (tagi > 0 && tagstk[tagi] == n)
- X badpat("Cyclical reference");
- X if (tagc > n) {
- X store(REF);
- X store(n);
- X }
- X else
- X badpat("Undetermined reference");
- X break;
- X#ifdef EXTEND
- X case 'b':
- X store(CHR);
- X store('\b');
- X break;
- X case 'n':
- X store(CHR);
- X store('\n');
- X break;
- X case 'f':
- X store(CHR);
- X store('\f');
- X break;
- X case 'r':
- X store(CHR);
- X store('\r');
- X break;
- X case 't':
- X store(CHR);
- X store('\t');
- X break;
- X#endif
- X default:
- X store(CHR);
- X store(*p);
- X }
- X break;
- X
- X default : /* an ordinary char */
- X store(CHR);
- X store(*p);
- X break;
- X }
- X sp = lp;
- X }
- X if (tagi > 0)
- X badpat("Unmatched \\(");
- X store(END);
- X sta = OKP;
- X return(0);
- X}
- X
- X
- Xstatic char *bol;
- Xstatic char *bopat[MAXTAG];
- Xstatic char *eopat[MAXTAG];
- Xstatic char *pmatch();
- X
- X/*
- X * re_exec:
- X * execute dfa to find a match.
- X *
- X * special cases: (dfa[0])
- X * BOL
- X * Match only once, starting from the
- X * beginning.
- X * CHR
- X * First locate the character without
- X * calling pmatch, and if found, call
- X * pmatch for the remaining string.
- X * END
- X * re_comp failed, poor luser did not
- X * check for it. Fail fast.
- X *
- X * If a match is found, bopat[0] and eopat[0] are set
- X * to the beginning and the end of the matched fragment,
- X * respectively.
- X *
- X */
- X
- Xint
- Xre_exec(lp)
- Xregister char *lp;
- X{
- X register char c;
- X register char *ep = 0;
- X register CHAR *ap = dfa;
- X
- X bol = lp;
- X
- X bopat[0] = 0;
- X bopat[1] = 0;
- X bopat[2] = 0;
- X bopat[3] = 0;
- X bopat[4] = 0;
- X bopat[5] = 0;
- X bopat[6] = 0;
- X bopat[7] = 0;
- X bopat[8] = 0;
- X bopat[9] = 0;
- X
- X switch(*ap) {
- X
- X case BOL: /* anchored: match from BOL only */
- X ep = pmatch(lp,ap);
- X break;
- X case CHR: /* ordinary char: locate it fast */
- X c = *(ap+1);
- X while (*lp && *lp != c)
- X lp++;
- X if (!*lp) /* if EOS, fail, else fall thru. */
- X return(0);
- X default: /* regular matching all the way. */
- X while (*lp) {
- X if ((ep = pmatch(lp,ap)))
- X break;
- X lp++;
- X }
- X break;
- X case END: /* munged automaton. fail always */
- X return(0);
- X }
- X if (!ep)
- X return(0);
- X
- X bopat[0] = lp;
- X eopat[0] = ep;
- X return(1);
- X}
- X
- X/*
- X * pmatch:
- X * internal routine for the hard part
- X *
- X * This code is mostly snarfed from an early
- X * grep written by David Conroy. The backref and
- X * tag stuff, and various other mods are by oZ.
- X *
- X * special cases: (dfa[n], dfa[n+1])
- X * CLO ANY
- X * We KNOW ".*" will match ANYTHING
- X * up to the end of line. Thus, go to
- X * the end of line straight, without
- X * calling pmatch recursively. As in
- X * the other closure cases, the remaining
- X * pattern must be matched by moving
- X * backwards on the string recursively,
- X * to find a match for xy (x is ".*" and
- X * y is the remaining pattern) where
- X * the match satisfies the LONGEST match
- X * for x followed by a match for y.
- X * CLO CHR
- X * We can again scan the string forward
- X * for the single char without recursion,
- X * and at the point of failure, we execute
- X * the remaining dfa recursively, as
- X * described above.
- X *
- X * At the end of a successful match, bopat[n] and eopat[n]
- X * are set to the beginning and end of subpatterns matched
- X * by tagged expressions (n = 1 to 9).
- X *
- X */
- X
- Xextern void re_fail();
- X
- X/*
- X * character classification table for word boundary
- X * operators BOW and EOW. the reason for not using
- X * ctype macros is that we can let the user add into
- X * our own table. see re_modw. This table is not in
- X * the bitset form, since we may wish to extend it
- X * in the future for other character classifications.
- X *
- X * TRUE for 0-9 A-Z a-z _
- X */
- Xstatic char chrtyp[MAXCHR] = {
- X 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
- X 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
- X 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
- X 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
- X 0, 0, 0, 0, 0, 0, 0, 0, 1, 1,
- X 1, 1, 1, 1, 1, 1, 1, 1, 0, 0,
- X 0, 0, 0, 0, 0, 1, 1, 1, 1, 1,
- X 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
- X 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
- X 1, 0, 0, 0, 0, 1, 0, 1, 1, 1,
- X 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
- X 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
- X 1, 1, 1, 0, 0, 0, 0, 0
- X };
- X
- X#define inascii(x) (0177&(x))
- X#define iswordc(x) chrtyp[inascii(x)]
- X#define isinset(x,y) ((x)[((y)&BLKIND)>>3] & (1<<((y)&BITIND)))
- X
- X/*
- X * skip values for CLO XXX to skip past the closure
- X *
- X */
- X
- X#define ANYSKIP 2 /* CLO ANY END ... */
- X#define CHRSKIP 3 /* CLO CHR chr END ... */
- X#define CCLSKIP 18 /* CLO CCL 16bytes END ... */
- X
- Xstatic char *
- Xpmatch(lp, ap)
- Xregister char *lp;
- Xregister CHAR *ap;
- X{
- X register char *e; /* extra pointer for CLO */
- X register char *bp; /* beginning of subpat.. */
- X register char *ep; /* ending of subpat.. */
- X register int op, c, n;
- X char *are; /* to save the line ptr. */
- X
- X while ((op = *ap++) != END)
- X switch(op) {
- X
- X case CHR:
- X if (*lp++ != *ap++)
- X return(0);
- X break;
- X case ANY:
- X if (!*lp++)
- X return(0);
- X break;
- X case CCL:
- X c = *lp++;
- X if (!isinset(ap,c))
- X return(0);
- X ap += BITBLK;
- X break;
- X case NCL:
- X c = *lp++;
- X if (isinset(ap,c))
- X return(0);
- X ap += BITBLK;
- X break;
- X case BOL:
- X if (lp != bol)
- X return(0);
- X break;
- X case EOL:
- X if (*lp)
- X return(0);
- X break;
- X case BOT:
- X bopat[*ap++] = lp;
- X break;
- X case EOT:
- X eopat[*ap++] = lp;
- X break;
- X case BOW:
- X if (!(lp!=bol && iswordc(lp[-1])) && iswordc(*lp))
- X break;
- X return(0);
- X case EOW:
- X if ((lp!=bol && iswordc(lp[-1])) && !iswordc(*lp))
- X break;
- X return(0);
- X case REF:
- X n = *ap++;
- X bp = bopat[n];
- X ep = eopat[n];
- X while (bp < ep)
- X if (*bp++ != *lp++)
- X return(0);
- X break;
- X case CLO:
- X are = lp;
- X switch(*ap) {
- X
- X case ANY:
- X while (*lp)
- X lp++;
- X n = ANYSKIP;
- X break;
- X case CHR:
- X c = *(ap+1);
- X while (*lp && c == *lp)
- X lp++;
- X n = CHRSKIP;
- X break;
- X case CCL:
- X case NCL:
- X while (*lp && (e = pmatch(lp, ap)))
- X lp = e;
- X n = CCLSKIP;
- X break;
- X default:
- X re_fail("closure: bad dfa.", *ap);
- X return(0);
- X }
- X
- X ap += n;
- X
- X while (lp >= are) {
- X if (e = pmatch(lp, ap))
- X return(e);
- X --lp;
- X }
- X return(0);
- X default:
- X re_fail("re_exec: bad dfa.", op);
- X return(0);
- X }
- X return(lp);
- X}
- X
- X/*
- X * re_modw:
- X * add new characters into the word table to
- X * change the re_exec's understanding of what
- X * a word should look like. Note that we only
- X * accept additions into the word definition.
- X *
- X * If the string parameter is 0 or null string,
- X * the table is reset back to the default, which
- X * contains A-Z a-z 0-9 _. [We use the compact
- X * bitset representation for the default table]
- X *
- X */
- X
- Xstatic char deftab[16] = {
- X 0, 0, 0, 0, 0, 0, '\377', '\003', '\376', '\377', '\377', '\207',
- X '\376', '\377', '\377', '\007'
- X};
- X
- Xvoid
- Xre_modw(s)
- Xregister char *s;
- X{
- X register int i;
- X
- X if (!s || !*s) {
- X for (i = 0; i < MAXCHR; i++)
- X if (!isinset(deftab,i))
- X iswordc(i) = 0;
- X }
- X else
- X while(*s)
- X iswordc(*s++) = 1;
- X}
- X
- X/*
- X * re_subs:
- X * substitute the matched portions of the src in
- X * dst.
- X *
- X * & substitute the entire matched pattern.
- X *
- X * \digit substitute a subpattern, with the given
- X * tag number. Tags are numbered from 1 to
- X * 9. If the particular tagged subpattern
- X * does not exist, null is substituted.
- X *
- X */
- Xint
- Xre_subs(src, dst)
- Xregister char *src;
- Xregister char *dst;
- X{
- X register char c;
- X register int pin;
- X register char *bp;
- X register char *ep;
- X
- X if (!*src || !bopat[0])
- X return(0);
- X
- X while (c = *src++) {
- X switch(c) {
- X
- X case '&':
- X pin = 0;
- X break;
- X
- X case '\\':
- X c = *src++;
- X if (c >= '0' && c <= '9') {
- X pin = c - '0';
- X break;
- X }
- X
- X default:
- X *dst++ = c;
- X continue;
- X }
- X
- X if ((bp = bopat[pin]) && (ep = eopat[pin])) {
- X while (*bp && bp < ep)
- X *dst++ = *bp++;
- X if (bp < ep)
- X return(0);
- X }
- X }
- X *dst = (char) 0;
- X return(1);
- X}
- X
- X#ifdef DEBUG
- X/*
- X * symbolic - produce a symbolic dump of the
- X * dfa
- X */
- Xsymbolic(s)
- Xchar *s;
- X{
- X (void)printf("pattern: %s\n", s);
- X (void)printf("dfacode:\n");
- X dfadump(dfa);
- X}
- X
- Xstatic
- Xdfadump(ap)
- XCHAR *ap;
- X{
- X register int n;
- X
- X while (*ap != END)
- X switch(*ap++) {
- X case CLO:
- X (void)printf("CLOSURE");
- X dfadump(ap);
- X switch(*ap) {
- X case CHR:
- X n = CHRSKIP;
- X break;
- X case ANY:
- X n = ANYSKIP;
- X break;
- X case CCL:
- X case NCL:
- X n = CCLSKIP;
- X break;
- X }
- X ap += n;
- X break;
- X case CHR:
- X (void)printf("\tCHR %c\n",*ap++);
- X break;
- X case ANY:
- X (void)printf("\tANY .\n");
- X break;
- X case BOL:
- X (void)printf("\tBOL -\n");
- X break;
- X case EOL:
- X (void)printf("\tEOL -\n");
- X break;
- X case BOT:
- X (void)printf("BOT: %d\n",*ap++);
- X break;
- X case EOT:
- X (void)printf("EOT: %d\n",*ap++);
- X break;
- X case BOW:
- X (void)printf("BOW\n");
- X break;
- X case EOW:
- X (void)printf("EOW\n");
- X break;
- X case REF:
- X (void)printf("REF: %d\n",*ap++);
- X break;
- X case CCL:
- X (void)printf("\tCCL [");
- X for (n = 0; n < MAXCHR; n++)
- X if (isinset(ap,(CHAR)n))
- X (void)printf("%c",n);
- X (void)printf("]\n");
- X ap += BITBLK;
- X break;
- X case NCL:
- X (void)printf("\tNCL [");
- X for (n = 0; n < MAXCHR; n++)
- X if (isinset(ap,(CHAR)n))
- X (void)printf("%c",n);
- X (void)printf("]\n");
- X ap += BITBLK;
- X break;
- X default:
- X (void)printf("bad dfa. opcode %o\n", ap[-1]);
- X exit(1);
- X break;
- X }
- X}
- X#endif
- END_OF_FILE
- if test 19043 -ne `wc -c <'regex.c'`; then
- echo shar: \"'regex.c'\" unpacked with wrong size!
- fi
- # end of 'regex.c'
- fi
- echo shar: End of archive 1 \(of 4\).
- cp /dev/null ark1isdone
- MISSING=""
- for I in 1 2 3 4 ; do
- if test ! -f ark${I}isdone ; then
- MISSING="${MISSING} ${I}"
- fi
- done
- if test "${MISSING}" = "" ; then
- echo You have unpacked all 4 archives.
- rm -f ark[1-9]isdone
- else
- echo You still must unpack the following archives:
- echo " " ${MISSING}
- fi
- exit 0
- exit 0 # Just in case...
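
The Notes in regex.c describe character classes as a 128-bit set, one bit per 7-bit ASCII character, stored in 16 bytes. The following is a minimal standalone sketch of that representation, not code taken from the kit. The names charset, set_clear, set_add, set_has and set_add_range are hypothetical; only the mask arithmetic (0170 for the byte index, 07 for the bit index) mirrors the BLKIND and BITIND defines in regex.c.

```c
#include <stdio.h>
#include <string.h>

#define SET_CHARS 128               /* 7-bit ASCII, like MAXCHR in regex.c */
#define SET_BYTES (SET_CHARS / 8)   /* 16 bytes, one bit per character */

/* Hypothetical wrapper type; regex.c keeps the raw bytes inline instead. */
typedef struct { unsigned char bits[SET_BYTES]; } charset;

/* Empty the set. */
static void set_clear(charset *s)
{
    memset(s->bits, 0, sizeof s->bits);
}

/* Bits 3..6 of c select the byte, bits 0..2 select the bit within it. */
static void set_add(charset *s, int c)
{
    s->bits[(c & 0170) >> 3] |= (unsigned char)(1 << (c & 07));
}

/* Membership test: one mask, one shift, one AND. */
static int set_has(const charset *s, int c)
{
    return (s->bits[(c & 0170) >> 3] & (1 << (c & 07))) != 0;
}

/* Add every character from lo to hi inclusive, e.g. the a-z in [a-z]. */
static void set_add_range(charset *s, int lo, int hi)
{
    int c;

    for (c = lo; c <= hi; c++)
        set_add(s, c);
}
```

A caller would clear a charset, add ranges and single characters, and then test membership with set_has. Because the index is masked to seven bits, a character code of 128 or above aliases onto a lower one, which is the same limitation the original masks imply.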
-